Amendments to the Claims: 

This listing of claims will replace all prior versions, and listings, of claims in the application: 
Listing of Claims: 

Claim 1 (currently amended): A method comprising: 

receiving a first program unit in a parallel computing environment, the first program unit 
including a reduction operation associated with a set of variables; 

translating the first program unit into a second program unit, the second program unit 
including [[to associate the reduction operation with]] a set of one or more instructions [[operative]] to 
partition the reduction operation between a plurality of threads including at least two threads and 
to reference a third program unit; and 

translating the first program unit into [[a]] the third program unit, the third program unit 
[[to associate the reduction operation with]] including a set of one or more instructions [[operative]] 
that encapsulate the reduction operation to perform an algebraic operation on the variables. 

Claim 2 (cancel) 

Claim 3 (currently amended): The method of claim 1 further comprising reducing the set 
of variables logarithmically. 

Claims 4-5 (cancel) 

Claim 6 (original): The method of claim 1 further comprising associating the plurality 
of threads each with a unique portion of the set of variables. 

Claim 7 (original): The method of claim 6 further comprising combining, in part, the 
variables associated with the plurality of threads in a pair-wise reduction operation. 

Claim 8 (currently amended): An apparatus comprising: 

a memory including a shared memory location; 

a translation unit coupled with the memory, the translation unit to translate a first 
program unit including a reduction operation associated with a set of at least two variables into a 
second program unit, the second program unit [[to associate the reduction operation with one or 
more instructions operative]] to partition the reduction operation between a plurality of threads 
including at least two threads and to reference a third program unit, the translation unit to also 
translate the first program unit into [[a]] the third program unit, the third program unit [[to 
associate the reduction operation with a set of one or more instructions operative]] to encapsulate 
the reduction operation to perform an algebraic operation on the variables; 

a compiler unit coupled with the translation unit and the memory, the compiler unit to 
compile the second program unit and the third program unit; and 

a linker unit coupled with the compiler unit and the memory, the linker unit to link the 
compiled second program unit and the compiled third program unit with a library. 

Claim 9 (cancel) 

Claim 10 (currently amended): The apparatus of claim 8 wherein the variables in 
the set of variables are each uniquely associated with the plurality of threads and the library 
includes instructions [[operative]] to combine, in part, the variables associated with the plurality of 
threads. 

Claim 11 (currently amended): The apparatus of claim 10 wherein the library 
includes instructions [[operative]] to combine, in part, the variables in a pair-wise reduction. 

Claim 12 (original): The apparatus of claim 8 further comprising a set of one or more 
processors to host the plurality of threads, the plurality of threads to execute instructions 
associated with the second program unit. 

Claim 13 (currently amended): The apparatus of claim 8 wherein the [[second]] third 
program unit includes a callback routine and the callback routine is associated with instructions 
operative to perform [[an]] the algebraic operation on at least two variables in the set of 
variables. 

Claim 14 (original): The apparatus of claim 13 wherein the library is operative to call 
the callback routine to perform, in part, a reduction on at least two variables in the set of 
variables. 

Claim 15 (currently amended): A machine-readable medium that provides 
instructions, that when executed by a set of one or more processors, enable the set of processors 
to perform [[operations]] a method comprising: 

receiving a first program unit in a parallel computing environment, the first program unit 
including a reduction operation associated with a set of variables; 

translating the first program unit into a second program unit, the second program unit 
including [[to associate the reduction operation with]] a set of one or more instructions [[operative]] to 
partition the reduction operation between a plurality of threads including at least two threads and 
to reference a third program unit; and 

translating the first program unit into [[a]] the third program unit, the third program unit 
[[to associate the reduction operation with]] including a set of one or more instructions [[operative]] 
that encapsulate the reduction operation to perform an algebraic operation on the variables. 

Claims 16-17 (cancel) 

Claim 18 (currently amended): The machine-readable medium of claim 15 wherein 
the method further [[comprising]] comprises instructions for reducing the variables, in part, 
logarithmically. 

Claim 19 (cancel) 

Claim 20 (currently amended): The machine-readable medium of claim 15 wherein 
the instructions cause the second program unit to utilize, in part, the third program unit to 
perform [[a]] the reduction operation on the set of variables. 

Claim 21 (new): The method of claim 1, further comprising performing a plurality 
of reduction operations in the third program unit. 

Claim 22 (new): The method of claim 1, further comprising performing a vector 
reduction operation in the third program unit via a N-dimension loop in the third program unit. 

Claim 23 (new): The method of claim 1, further comprising using a run-time library 
to implement the reduction operation. 

Claim 24 (new): The apparatus of claim 8, wherein the third program unit is to 
perform a plurality of vector operations. 

Claim 25 (new): The apparatus of claim 8, wherein the third program unit is to 
perform a vector reduction operation via a N-dimension loop. 

Claim 26 (new): The apparatus of claim 8, wherein the third program unit is to 
perform the algebraic operation using the library. 
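
Illustrative note (not part of the claims): the arrangement recited above can be sketched informally as follows. This is only a minimal sketch under stated assumptions, not the applicant's implementation. The names `partition`, `pairwise_combine`, and `parallel_reduce` are hypothetical; the `callback` argument stands in for the routine of the third program unit that performs the algebraic operation, `parallel_reduce` stands in for the second program unit that partitions the work between threads, and `pairwise_combine` stands in for a library routine that combines per-thread results pair-wise in a logarithmic number of rounds. The operation is assumed to be associative.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def partition(values, num_threads):
    """Split values into num_threads contiguous, non-overlapping chunks."""
    size, extra = divmod(len(values), num_threads)
    chunks, start = [], 0
    for i in range(num_threads):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def pairwise_combine(partials, callback):
    """Hypothetical library routine: combine adjacent partial results
    two at a time until one remains. Returns (result, rounds)."""
    rounds = 0
    while len(partials) > 1:
        combined = [callback(partials[i], partials[i + 1])
                    for i in range(0, len(partials) - 1, 2)]
        if len(partials) % 2:
            # An unpaired trailing element is carried to the next round.
            combined.append(partials[-1])
        partials = combined
        rounds += 1
    return partials[0], rounds


def parallel_reduce(values, callback, num_threads=4):
    """Hypothetical driver: each thread reduces its own unique portion of
    the variables, then the partial results are combined pair-wise."""
    if not values:
        raise ValueError("cannot reduce an empty set of variables")
    # Drop empty chunks when there are fewer values than threads.
    chunks = [c for c in partition(values, num_threads) if c]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(lambda c: reduce(callback, c), chunks))
    return pairwise_combine(partials, callback)


if __name__ == "__main__":
    total, rounds = parallel_reduce(list(range(1, 101)), operator.add, 4)
    print(total, rounds)
```

In this sketch the driver never names the algebraic operation itself; it only passes the callback along, which loosely mirrors a second program unit that references a third program unit encapsulating the operation.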